In [34]:
import math
import random
from collections import Counter
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Probabilities quantify the likelihood that a specific event occurs out of the set of all possible events.
Notationally, $P(E)$ means "the probability of event $E$."
Events $E$ and $F$ are dependent if information about $E$ gives us information about the probability of $F$ occurring (or vice versa). If this is not the case, the events are independent of each other.
For independent events, the probability of both occurring is the product of the probabilities of each occurring: $$P(E, F) = P(E)P(F)$$
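As a quick sanity check (a sketch not in the original; the events here are illustrative), we can simulate two independent fair coin flips and confirm that the empirical joint probability matches the product of the marginals:
In [ ]:
# hypothetical check that P(E, F) ≈ P(E) P(F) for two independent coin flips
random.seed(0)
trials = 100_000
flips = [(random.random() < 0.5, random.random() < 0.5) for _ in range(trials)]
p_e = sum(e for e, _ in flips) / trials          # P(first flip is heads)
p_f = sum(f for _, f in flips) / trials          # P(second flip is heads)
p_ef = sum(e and f for e, f in flips) / trials   # P(both flips are heads)
p_ef, p_e * p_f   # both should be close to 0.25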
If events are not independent, we can express the conditional probability of $E$ given $F$ (the probability that $E$ happens given that we know $F$ happens): $$P(E\ |\ F) = P(E, F)\ /\ P(F)$$ which can be rearranged as $$P(E, F) = P(E\ |\ F)P(F)$$
When $E$ and $F$ are independent: $$P(E\ |\ F) = P(E)$$
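To see conditioning in action when events are *not* independent, here is a small simulation in the spirit of the classic two-child puzzle (an illustrative sketch; the event definitions are assumptions for the example). Conditioning on "the older child is a girl" gives a different answer than conditioning on "at least one child is a girl":
In [ ]:
# two-child puzzle: each child is independently "boy" or "girl" with probability 1/2
random.seed(0)
both_girls = older_girl = either_girl = 0
for _ in range(100_000):
    older = random.choice(["boy", "girl"])
    younger = random.choice(["boy", "girl"])
    if older == "girl":
        older_girl += 1
    if older == "girl" and younger == "girl":
        both_girls += 1
    if older == "girl" or younger == "girl":
        either_girl += 1
print("P(both | older is a girl):", both_girls / older_girl)     # ~ 1/2
print("P(both | at least one girl):", both_girls / either_girl)  # ~ 1/3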
Conditional probabilities can be "reversed": $$P(E\ |\ F) = P(E, F)\ /\ P(F) = P(F\ |\ E)P(E)\ /\ P(F)$$ Since $E$ either happens or doesn't, we can split $P(F)$ across the two cases: $$P(F) = P(F, E) + P(F, \neg E)$$
Substituting this expression for $P(F)$ gives Bayes's theorem: $$P(E\ |\ F) = P(F\ |\ E)P(E)\ /\ [P(F\ |\ E)P(E) + P(F\ |\ \neg E)P(\neg E)]$$
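A worked example makes the theorem's force clear. Suppose (hypothetical numbers) a disease affects 1 in 10,000 people and a test is 99% accurate in both directions; Bayes's theorem gives the probability of disease given a positive test:
In [ ]:
# hypothetical numbers: P(E) = 1/10000, test is 99% accurate both ways
p_e = 0.0001              # P(E): has the disease
p_f_given_e = 0.99        # P(F | E): positive test given disease
p_f_given_not_e = 0.01    # P(F | ¬E): false positive rate
p_e_given_f = (p_f_given_e * p_e) / (p_f_given_e * p_e +
                                     p_f_given_not_e * (1 - p_e))
p_e_given_f   # ≈ 0.0098
Even with a positive test, the probability of actually having the disease is under 1%, because the few true positives are swamped by false positives from the much larger healthy population.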
Coin flips follow a discrete distribution, i.e., one that takes on only separate, distinct values with no "in-betweens." A continuous distribution is one that allows for a full range of values along a continuum such as height or weight.
Continuous distributions are described by a probability density function (pdf); the probability of seeing a value in a given range is the integral of the pdf over that range.
The pdf for the uniform distribution is:
In [1]:
def uniform_pdf(x):
    return 1 if 0 <= x < 1 else 0
In [22]:
xs = np.arange(-1, 2, .001)
ys = [uniform_pdf(x) for x in xs]
plt.plot(xs, ys);
In [15]:
uniform_pdf(-0.01)
Out[15]:
0
The cumulative distribution function (cdf) gives the probability that a random variable is less than or equal to a certain value.
In [11]:
def uniform_cdf(x):
    """Returns the probability that a uniform random variable is <= x"""
    if x < 0: return 0    # uniform random is never less than 0
    elif x < 1: return x  # e.g. P(X <= 0.4) = 0.4
    else: return 1        # uniform random is always less than 1
In [21]:
xs = np.arange(-1, 2, .001)
ys = [uniform_cdf(x) for x in xs]
plt.step(xs, ys);
The normal distribution is the quintessential example of a continuous distribution (the classic bell curve shape). It is completely determined by two parameters: the mean $\mu$ and the standard deviation $\sigma$.
The function for the distribution is: $$f(x\ |\ \mu, \sigma) = \frac{1}{\sqrt{2\pi}\sigma}\ \exp\bigg(-\frac{(x - \mu)^2}{2\sigma^2}\bigg)$$
In [23]:
def normal_pdf(x, mu=0, sigma=1):
    sqrt_two_pi = math.sqrt(2 * math.pi)
    return math.exp(-(x - mu)**2 / 2 / sigma**2) / (sqrt_two_pi * sigma)
In [26]:
xs = [x / 10.0 for x in range(-50, 50)]
plt.plot(xs, [normal_pdf(x, sigma=1) for x in xs],'-',label='mu=0,sigma=1')
plt.plot(xs, [normal_pdf(x, sigma=2) for x in xs],'--',label='mu=0,sigma=2')
plt.plot(xs, [normal_pdf(x, sigma=0.5) for x in xs],':',label='mu=0,sigma=0.5')
plt.plot(xs, [normal_pdf(x, mu=-1) for x in xs],'-.',label='mu=-1,sigma=1')
plt.legend()
plt.title("Various Normal pdfs")
plt.show()
When $\mu = 0$ and $\sigma = 1$ we call the distribution the standard normal distribution.
In [27]:
def normal_cdf(x, mu=0, sigma=1):
    return (1 + math.erf((x - mu) / math.sqrt(2) / sigma)) / 2
plt.plot(xs, [normal_cdf(x, sigma=1) for x in xs],'-',label='mu=0,sigma=1')
plt.plot(xs, [normal_cdf(x, sigma=2) for x in xs],'--',label='mu=0,sigma=2')
plt.plot(xs, [normal_cdf(x, sigma=0.5) for x in xs],':',label='mu=0,sigma=0.5')
plt.plot(xs, [normal_cdf(x, mu=-1) for x in xs],'-.',label='mu=-1,sigma=1')
plt.legend(loc=4) # bottom right
plt.title("Various Normal cdfs")
plt.show()
In [28]:
def inverse_normal_cdf(p, mu=0, sigma=1, tolerance=0.00001):
    """find approximate inverse using binary search"""
    # if not standard, compute standard and rescale
    if mu != 0 or sigma != 1:
        return mu + sigma * inverse_normal_cdf(p, tolerance=tolerance)
    low_z, low_p = -10.0, 0   # normal_cdf(-10) is (very close to) 0
    hi_z, hi_p = 10.0, 1      # normal_cdf(10) is (very close to) 1
    while hi_z - low_z > tolerance:
        mid_z = (low_z + hi_z) / 2   # consider the midpoint
        mid_p = normal_cdf(mid_z)    # and the cdf's value there
        if mid_p < p:
            # midpoint is still too low, search above it
            low_z, low_p = mid_z, mid_p
        elif mid_p > p:
            # midpoint is still too high, search below it
            hi_z, hi_p = mid_z, mid_p
        else:
            break
    return mid_z
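A quick round-trip check (a usage sketch): the z-value whose cdf is 0.975 should be roughly 1.96, and feeding it back through normal_cdf should recover 0.975.
In [ ]:
z = inverse_normal_cdf(0.975)
z, normal_cdf(z)   # z ≈ 1.96, and the cdf maps it back to ≈ 0.975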
The central limit theorem states that the average of a large number of independent and identically distributed random variables is itself approximately normally distributed.
So, if $x_1, ..., x_n$ are independent and identically distributed with mean $\mu$ and standard deviation $\sigma$, then for large $n$, $\frac{1}{n}(x_1 +\ ...\ + x_n)$ is approximately normally distributed with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$. Equivalently, $$\frac{(x_1 +\ ...\ + x_n)\ -\ \mu n}{\sigma \sqrt{n}}$$ is approximately standard normal (mean 0, standard deviation 1).
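We can sketch this empirically (an illustrative check, not part of the original): take the mean of $n = 100$ uniform(0, 1) draws many times and compare the empirical standard deviation of those means to the CLT's prediction $\sigma / \sqrt{n}$.
In [ ]:
# CLT sketch: means of 100 uniform(0, 1) draws, repeated many times
n, num_means = 100, 10_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and sd of uniform(0, 1)
means = [sum(random.random() for _ in range(n)) / n for _ in range(num_means)]
np.mean(means), np.std(means), sigma / math.sqrt(n)  # empirical sd ≈ sigma / sqrt(n)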
A binomial random variable (Binomial(n, p)) is the sum of $n$ independent Bernoulli (Bernoulli(p)) random variables. Each of the variables equals 1 with a probability of $p$ and equals 0 with a probability of $1 - p$.
In [29]:
def bernoulli_trial(p):
    return 1 if random.random() < p else 0

def binomial(n, p):
    return sum(bernoulli_trial(p) for _ in range(n))
The mean of a Bernoulli(p) variable is $p$ and its standard deviation is $\sqrt{p(1 - p)}$.
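A quick empirical check of those formulas (a sketch; the choice $p = 0.3$ is arbitrary):
In [ ]:
# empirical mean and sd of Bernoulli(p) vs. the formulas p and sqrt(p(1 - p))
p = 0.3
draws = [bernoulli_trial(p) for _ in range(100_000)]
np.mean(draws), np.std(draws), math.sqrt(p * (1 - p))  # last two should agree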
In [30]:
def plot_binomial(p, n, num_points):
    data = [binomial(n, p) for _ in range(num_points)]
    # use a bar chart to show the actual binomial samples
    histogram = Counter(data)
    plt.bar([x - 0.4 for x in histogram.keys()],
            [v / num_points for v in histogram.values()],
            0.8,
            align='edge',   # bars of width 0.8 centered on each integer value
            color='0.75')
    # use a line chart to show the normal approximation
    mu = p * n
    sigma = math.sqrt(n * p * (1 - p))
    xs = range(min(data), max(data) + 1)
    ys = [normal_cdf(i + 0.5, mu, sigma) - normal_cdf(i - 0.5, mu, sigma)
          for i in xs]
    plt.plot(xs, ys)
    plt.title('Binomial Distribution vs. Normal Approximation')
In [35]:
plot_binomial(0.75, 100, 10000)